Motivation

The 2016 General Election for President of the United States was our very first general election in which we were able to vote. Many pundits, politicians, and citizens alike found the contest between Hillary Clinton and Donald Trump to be one of the most bitter presidential races in recent memory, and many more were surprised by the outcome.

Most pollsters and statisticians had Hillary Clinton maintaining a comfortable margin before the election. Yet, Donald Trump emerged victorious as the 45th President of the United States. Our goal is to try to explain why.

We have incorporated US Census demographic data, as well as employing comparisons between Donald Trump’s 2016 victory with Barack Obama’s 2012 re-election against Mitt Romney.

What You Will Find Below

Below are primarily the visualizations generated from our analysis. They were created via leaflet and ggplot2 in R. Our 2012 election data was obtained from the Huffington Post, and our 2016 election data was sourced from Kaggle.

Code for the graphics is not included in chunks below. Please consult our Github for more information, including our original graphics and code.

Our analysis excludes Alaska and Hawaii as well as third-party candidates.


2016 Election

Margin of Victory

Let’s take a look at the 2016 electoral map, divided by county.

Gradient Map

The gradient within the map represents the margin of victory, from 60% or greater margin of victory for Trump in red to 60% or more margin of victory for Clinton in blue.

Naturally, Clinton has a visible edge in more urban, coastal areas while Trump has an advantage in more rural areas.


Pure Win-Loss

We can clarify the win by county by removing the gradient.


Interactive

Let’s check out a few interactive maps created with leaflet. Clinton narrowly won Virginia while Trump narrowly won Pennsylvania. Clicking on the county will give you its name, Trump’s margin of victory, as well as each candidate’s respective number of votes (and share of the vote).

Virginia

Pennsylvania

United States


2012 Election

Margin of Victory


Pure Win-Loss

Like the procedure above, we can remove the gradient and look at the winner by county.

It is certainly quite evident that Obama was able to win a significantly greater number of counties in 2012 compared to Clinton in 2016. We will run through some exact comparisons below.


2012 and 2016 Comparisons

Map Comparisons

Let’s compare the win-loss maps by county from 2012 to 2016.

2012

2016

As evidenced in the maps, Obama was much more successful than Clinton in the Northeast and Midwest (note that these graphs do not account for turnout, so we’re only able to measure strict win-loss). We’ll explore these areas further in the maps below.


2016 Flipped Counties

How successful were each of the candidates in flipping counties from their previous victor? Below is a graph of the 2016 electoral map, with flips highlighted in yellow for Trump (Trump winning a county won by Obama in 2012) and green for Clinton (Clinton winning a county won by Romney in 2012).

We can clean up the colors by marking the unchanged counties in grey and re-marking the flipped counties for clarity.

Table of Flipped Counties by State

We can further drill down and take a look at the geographic breakdown of flipped counties by state.

Trump
Clinton

As evidenced from the tables above, Trump was much more successful than Clinton in flipping counties from 2012 to 2016, flipping 223 counties won by Obama in 2012 compared to Clinton’s 17 counties won by Romney in 2012.


Interactive

Given the information from above, it looks as if Clinton lost significant numbers of counties in the Northeast and Midwest that were once held by Obama. Let’s explore in greater detail

Northeast

Midwest

Voter Turnout Comparisons

Given the voting information for 2012 and 2016, we can also construct a representation of voter turnout from 2012 to 2016. As a side note, the very large decrease was from Los Angeles County for both parties.

2012-2016, Total Turnout

2012-2016, Republican Party Turnout

We can compare the 2012 and 2016 turnouts for just the Republican Party. Note the interesting pockets of lower-than-expected turnout in Arizona, Utah, and Washington.

2012-2016, Democratic Party Turnout

We can do the same for the Democratic Party. Note that the disparity in LA County is much more pronounced for the Democrats than the Republicans.


Hypotheticals

Even though we understand that the circumstances surrounding both the 2012 and 2016 elections were different, how well could Obama have performed in the 2016 election had he gone against Trump and maintained the same performance that he did in 2012?

Interactive

Northeast

Midwest

Head-to-Head Hypothetical Matchup

Here’s what the US county electoral map would have looked like if we matched Obama’s 2012 performance to Trump’s 2016 victory.


Machine Learning and Predictive Models

Variable Importance, 2012

Let’s take a look at variable importance with regards to US Census data (per-county basis) for 2012. Note the high importance of the attribute measuring the proportion of the county that was considered ‘Asian.’


Variable Importance, 2016

We can perform the same importance analysis for 2016.


Conditional Inference Tree

We can build a conditional inference machine learning model to predict county-level outcomes on the 2012 data. In this case, our accuracy was 86.4% on our test data set (separated from training data). As a note, accuracy is measured by summing the top-left cell and the bottom-right cell (along the lower-right diagonal).

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  624 
## 
##  
##              | Predicted Type 
##  Actual Type |         0 |         1 | Row Total | 
## -------------|-----------|-----------|-----------|
##            0 |       460 |        16 |       476 | 
##              |     0.737 |     0.026 |           | 
## -------------|-----------|-----------|-----------|
##            1 |        69 |        79 |       148 | 
##              |     0.111 |     0.127 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |       529 |        95 |       624 | 
## -------------|-----------|-----------|-----------|
## 
## 

We can use this model on 2016 data as well. Here, our accuracy decreased on testing data to 71.65%.

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  624 
## 
##  
##                 | Predicted Type 
##     Actual Type |         0 |         1 | Row Total | 
## ----------------|-----------|-----------|-----------|
##    Donald Trump |       437 |        85 |       522 | 
##                 |     0.700 |     0.136 |           | 
## ----------------|-----------|-----------|-----------|
## Hillary Clinton |        92 |        10 |       102 | 
##                 |     0.147 |     0.016 |           | 
## ----------------|-----------|-----------|-----------|
##    Column Total |       529 |        95 |       624 | 
## ----------------|-----------|-----------|-----------|
## 
## 

Sequential Covering and Rule-based Learning

We used a rule-based method (jRip) to model the 2012 data. Here, our accuracy in predicting county-level outcomes was 85.3%. We’ve also listed the rules below. The general syntax is rule-set => prediction (# of instances correctly classified by rule, # of instances misclassified by rule). Our successes with rule-based sequential covering methods (compared to conditional inference trees) suggests a lower level of complexity with regards to individual voting decisions.

## JRIP rules:
## ===========
## 
## (p_white <= 53.5) and (p_hisp <= 29.3) and (p_black >= 46.6) => ObamaWin=1 (88.0/1.0)
## (p_asian >= 1.3) and (pop_sqr_mile2010 >= 446.7) and (p_white <= 68.6) => ObamaWin=1 (106.0/18.0)
## (p_asian >= 1.1) and (pop_sqr_mile2010 >= 282.7) and (p_female >= 51.4) => ObamaWin=1 (56.0/18.0)
## (p_bachelors >= 25.4) and (p_asian >= 3.4) and (pop_65+ >= 13.9) => ObamaWin=1 (36.0/8.0)
## (p_bachelors >= 28) and (p_hisp <= 2.2) => ObamaWin=1 (33.0/12.0)
## (p_asian >= 0.9) and (p_white <= 54.9) and (p_bachelors >= 15.9) => ObamaWin=1 (47.0/13.0)
## (p_bachelors >= 28) and (p_bachelors >= 39.4) and (pop_total_2010 <= 23509) => ObamaWin=1 (16.0/2.0)
## (p_white <= 31.5) => ObamaWin=1 (33.0/7.0)
##  => ObamaWin=0 (2074.0/209.0)
## 
## Number of Rules : 9
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  624 
## 
##  
##              | Predicted Type 
##  Actual Type |         0 |         1 | Row Total | 
## -------------|-----------|-----------|-----------|
##            0 |       448 |        28 |       476 | 
##              |     0.718 |     0.045 |           | 
## -------------|-----------|-----------|-----------|
##            1 |        64 |        84 |       148 | 
##              |     0.103 |     0.135 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |       512 |       112 |       624 | 
## -------------|-----------|-----------|-----------|
## 
## 

Interestingly, when we apply this model trained on 2012 data to the 2016 data, our accuracy on test data skyrockets to 93.5% (we’re able to correctly predict county-level outcomes 93.5% of the time)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  624 
## 
##  
##                 | Predicted Type 
##     Actual Type |         0 |         1 | Row Total | 
## ----------------|-----------|-----------|-----------|
##    Donald Trump |       504 |        18 |       522 | 
##                 |     0.808 |     0.029 |           | 
## ----------------|-----------|-----------|-----------|
## Hillary Clinton |        23 |        79 |       102 | 
##                 |     0.037 |     0.127 |           | 
## ----------------|-----------|-----------|-----------|
##    Column Total |       527 |        97 |       624 | 
## ----------------|-----------|-----------|-----------|
## 
## 
## JRIP rules:
## ===========
## 
## (p_white <= 47.1) and (p_indian <= 1) => lead=Hillary Clinton (132.0/15.0)
## (p_bachelors >= 27.3) and (p_asian >= 3.5) => lead=Hillary Clinton (135.0/36.0)
## (p_white <= 54.9) and (pop_sqr_mile2010 >= 48.9) and (pop_sqr_mile2010 >= 303.8) => lead=Hillary Clinton (18.0/0.0)
## (p_white <= 54.1) and (p_white <= 33.3) and (p_white <= 23.3) => lead=Hillary Clinton (15.0/1.0)
## (p_bachelors >= 25.6) and (pop_sqr_mile2010 >= 619.7) and (pop_sqr_mile2010 >= 1115.3) => lead=Hillary Clinton (19.0/4.0)
## (p_bachelors >= 28) and (p_bachelors >= 39.4) => lead=Hillary Clinton (32.0/11.0)
## (p_bachelors >= 27.3) and (p_hisp <= 2.2) and (p_black <= 0.7) => lead=Hillary Clinton (14.0/4.0)
## (p_white <= 54.9) and (pop_sqr_mile2010 >= 48.9) and (median_Income <= 34285) => lead=Hillary Clinton (9.0/0.0)
## (p_bachelors >= 21.3) and (p_hisp >= 23.5) and (p_indian >= 2.1) and (pop_sqr_mile2010 >= 4.7) => lead=Hillary Clinton (10.0/0.0)
## (p_white <= 52.3) and (p_hisp <= 10.1) and (median_Income >= 36022) and (pop_total_2010 >= 7065) => lead=Hillary Clinton (12.0/3.0)
## (p_bachelors >= 18.2) and (p_white <= 36) => lead=Hillary Clinton (6.0/0.0)
##  => lead=Donald Trump (2087.0/57.0)
## 
## Number of Rules : 12

If we train a model and test it with purely 2016 data, interestingly, our accuracy decreases to 92%.

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  624 
## 
##  
##                 | Predicted Type 
##     Actual Type |    Donald Trump | Hillary Clinton |       Row Total | 
## ----------------|-----------------|-----------------|-----------------|
##    Donald Trump |             498 |              24 |             522 | 
##                 |           0.798 |           0.038 |                 | 
## ----------------|-----------------|-----------------|-----------------|
## Hillary Clinton |              26 |              76 |             102 | 
##                 |           0.042 |           0.122 |                 | 
## ----------------|-----------------|-----------------|-----------------|
##    Column Total |             524 |             100 |             624 | 
## ----------------|-----------------|-----------------|-----------------|
## 
## 

kNN

We also used a k-nearest-neighbor algorithm for 2012 and 2016 data. With 2012 data, we were able to achieve an accuracy of 86.8%.

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  623 
## 
##  
##                 | prc_test_pred 
## prc_test_labels |         0 |         1 | Row Total | 
## ----------------|-----------|-----------|-----------|
##               0 |       465 |        11 |       476 | 
##                 |     0.977 |     0.023 |     0.764 | 
##                 |     0.868 |     0.126 |           | 
##                 |     0.746 |     0.018 |           | 
## ----------------|-----------|-----------|-----------|
##               1 |        71 |        76 |       147 | 
##                 |     0.483 |     0.517 |     0.236 | 
##                 |     0.132 |     0.874 |           | 
##                 |     0.114 |     0.122 |           | 
## ----------------|-----------|-----------|-----------|
##    Column Total |       536 |        87 |       623 | 
##                 |     0.860 |     0.140 |           | 
## ----------------|-----------|-----------|-----------|
## 
## 

Using 2016 training and testing data, our accuracy on a county-level basis increased to 93.57%.

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  623 
## 
##  
##                 | prc_test_pred 
## prc_test_labels |    Donald Trump | Hillary Clinton |       Row Total | 
## ----------------|-----------------|-----------------|-----------------|
##    Donald Trump |             506 |              15 |             521 | 
##                 |           0.971 |           0.029 |           0.836 | 
##                 |           0.953 |           0.163 |                 | 
##                 |           0.812 |           0.024 |                 | 
## ----------------|-----------------|-----------------|-----------------|
## Hillary Clinton |              25 |              77 |             102 | 
##                 |           0.245 |           0.755 |           0.164 | 
##                 |           0.047 |           0.837 |                 | 
##                 |           0.040 |           0.124 |                 | 
## ----------------|-----------------|-----------------|-----------------|
##    Column Total |             531 |              92 |             623 | 
##                 |           0.852 |           0.148 |                 | 
## ----------------|-----------------|-----------------|-----------------|
## 
## 

Prediction Mapping

We can map the result of our jRip model and compare it to the actual 2016 result.

## JRIP rules:
## ===========
## 
## (p_white <= 52.3) and (p_hisp <= 28) => ObamaWin=1 (230.0/31.0)
## (p_asian >= 1.3) and (pop_total_2010 >= 266929) and (pop_total_2010 >= 735332) => ObamaWin=1 (54.0/6.0)
## (p_asian >= 1.3) and (p_bachelors >= 27.7) and (pop_65+ >= 13.5) and (p_asian >= 2.7) => ObamaWin=1 (69.0/18.0)
## (p_bachelors >= 25.1) and (p_bachelors >= 38.7) and (median_Income <= 72745) => ObamaWin=1 (64.0/18.0)
## (p_asian >= 1.1) and (pop_sqr_mile2010 >= 492.6) and (p_HS <= 88.8) and (pop_sqr_mile2010 >= 1115.3) => ObamaWin=1 (16.0/2.0)
## (p_asian >= 1.1) and (pop_total_2010 >= 153920) and (p_white <= 59.5) => ObamaWin=1 (33.0/12.0)
## (p_white <= 55.7) and (p_white <= 32) => ObamaWin=1 (33.0/7.0)
## (p_HS >= 86.5) and (pop_sqr_mile2010 >= 16.6) and (p_HS >= 89.6) and (p_hisp <= 2.2) and (p_asian >= 0.6) and (median_Income <= 57565) and (p_bachelors >= 25.9) and (pop_sqr_mile2010 <= 199) => ObamaWin=1 (22.0/1.0)
##  => ObamaWin=0 (2591.0/266.0)
## 
## Number of Rules : 9
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  3112 
## 
##  
##                 | Predicted Type 
##     Actual Type |         0 |         1 | Row Total | 
## ----------------|-----------|-----------|-----------|
##    Donald Trump |      2500 |       125 |      2625 | 
##                 |     0.803 |     0.040 |           | 
## ----------------|-----------|-----------|-----------|
## Hillary Clinton |        91 |       396 |       487 | 
##                 |     0.029 |     0.127 |           | 
## ----------------|-----------|-----------|-----------|
##    Column Total |      2591 |       521 |      3112 | 
## ----------------|-----------|-----------|-----------|
## 
## 

2016 Prediction Map